The present invention relates to computerized information retrieval systems and, in particular, to an automatic system for responding to queries of scientific papers composed of text and figures.
Scientific research relies upon the communication of information through technical papers, typically published in peer-reviewed scientific literature. This communication has been facilitated by electronic databases of scientific literature that permit rapid retrieval and review of particular papers.
Figures, for example photographs, drawings, or charts in which text is incidental or absent, are an important component of most technical papers. In many cases, the figures are the “evidence” of experiments. Frequently, the relevance or significance of a paper cannot be determined quickly without access to its figures.
Despite the importance of images in technical papers, images are not well integrated into the document searching process. Images, and scientific images in particular, can be difficult to characterize automatically.